Outputs

  • Dataset: The dataset(s) produced by the data preparation phase, to be used for modeling or the major analysis work of the project.
  • Dataset description: A description of the dataset(s) to be used for modeling and the major analysis work of the project.
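
As a rough illustration of what a dataset description might record, the sketch below summarizes a prepared dataset held as a list of dictionaries. The function name `describe_dataset` and the sample records are hypothetical and stand in for the project's own data; a real project would likely use its own tooling for this.

```python
def describe_dataset(rows):
    """Summarize a dataset given as a list of dicts (one dict per record)."""
    # Collect every attribute name that appears in any record.
    attributes = sorted({key for row in rows for key in row})
    # Count records in which each attribute is absent or None.
    missing = {a: sum(1 for row in rows if row.get(a) is None) for a in attributes}
    return {
        "n_records": len(rows),
        "attributes": attributes,
        "missing_per_attribute": missing,
    }


# Hypothetical prepared records, for illustration only.
prepared = [
    {"customer_id": 1, "age": 34, "spend": 120.0},
    {"customer_id": 2, "age": None, "spend": 80.5},
]

summary = describe_dataset(prepared)
print(summary)
```

The summary covers only record count, attribute names, and missing values; a fuller description would typically add data types, value ranges, and the source of each attribute.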

Data Selection

Decide which data to use for analysis. Criteria include relevance to the data mining goals, data quality, and technical constraints such as limits on data volume or data types. Data selection covers both attributes (columns) and records (rows) in a table.

Rationale for inclusion/exclusion

  • List the data to be included/excluded and the reasons for these decisions.
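
The selection of attributes and records, together with the reasons for each decision, can be sketched as follows. This is a minimal example under stated assumptions: the records, the attribute names, the exclusion reasons, and the helpers `keep_record` and `select_data` are all hypothetical and are not taken from any particular project.

```python
# Hypothetical raw records standing in for the project's source data.
records = [
    {"customer_id": 1, "age": 34, "region": "north", "free_text_notes": "n/a", "spend": 120.0},
    {"customer_id": 2, "age": None, "region": "south", "free_text_notes": "", "spend": 80.5},
    {"customer_id": 3, "age": 51, "region": "north", "free_text_notes": "vip", "spend": 310.0},
]

# Attribute (column) selection: each excluded attribute is paired with its reason.
excluded_attributes = {
    "free_text_notes": "not relevant to the data mining goal (assumed)",
}


def keep_record(row):
    """Record (row) selection: exclude rows with a missing age (quality criterion)."""
    return row["age"] is not None


def select_data(rows, excluded, row_filter):
    """Return the rows that pass row_filter, without the excluded attributes."""
    return [
        {key: value for key, value in row.items() if key not in excluded}
        for row in rows
        if row_filter(row)
    ]


selected = select_data(records, excluded_attributes, keep_record)

# Print the rationale alongside the result so the decisions stay documented.
for attribute, reason in excluded_attributes.items():
    print(f"Excluded attribute {attribute!r}: {reason}")
print(f"Kept {len(selected)} of {len(records)} records")
```

Keeping the exclusion reasons in the same structure that drives the selection means the documented rationale and the actual filtering cannot drift apart.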
